Chapter 33: Final Project
Project Overview and Selection
Project Overview
You've spent eight weeks building your recursive problem-solving toolkit. Now it's time to synthesize everything you've learned into a substantial, production-quality project. This final project is your opportunity to demonstrate mastery of:
- Recursive problem decomposition: Breaking complex problems into self-similar subproblems
- Pattern recognition: Identifying which recursive patterns apply to which scenarios
- Optimization techniques: Memoization, tail recursion conversion, pruning strategies
- Debugging methodology: Systematic diagnosis of recursive failures
- Professional code quality: Clear structure, comprehensive testing, performance analysis
Project Requirements (All Options)
Regardless of which option you choose, your submission must include:
1. Core Implementation
- Complete, working recursive solution to the chosen problem
- Multiple recursive functions demonstrating different patterns
- Proper base case handling with edge case coverage
- Clear function signatures with type hints and docstrings
2. Progressive Development Documentation
- Initial naive implementation showing your starting point
- At least 3 iterations with documented improvements
- Failure analysis for each iteration (what broke, why, how you fixed it)
- Before/after comparisons with concrete evidence of improvement
3. Testing Suite
- Unit tests covering base cases, recursive cases, and edge cases
- Performance benchmarks comparing different approaches
- Test data generators for stress testing
- Failure mode demonstrations (intentionally broken versions with diagnostic output)
4. Analysis Document
- Complexity analysis: Time and space complexity with recurrence relations
- Call stack depth analysis: Maximum recursion depth for different inputs
- Design decisions: Why you chose specific recursive patterns
- Trade-off discussion: When your solution works well vs. when alternatives are better
5. Code Quality
- Clean, readable code following Python conventions
- Comprehensive comments explaining recursive logic
- Modular design with reusable components
- Error handling for invalid inputs
Evaluation Criteria
Your project will be evaluated on:
| Criterion | Weight | What We're Looking For |
|---|---|---|
| Correctness | 30% | Solution works for all test cases, handles edge cases properly |
| Recursive Design | 25% | Appropriate use of recursion, clear problem decomposition |
| Code Quality | 20% | Readable, well-structured, properly documented |
| Analysis Depth | 15% | Thorough complexity analysis, insightful trade-off discussion |
| Testing | 10% | Comprehensive test coverage, meaningful benchmarks |
Time Expectations
- Planning and design: 2-3 hours
- Initial implementation: 4-6 hours
- Iteration and refinement: 4-6 hours
- Testing and documentation: 3-4 hours
- Total: 15-20 hours over 2 weeks
Choosing Your Project
Read through all four options carefully. Consider:
- Your interests: Which problem domain excites you?
- Your strengths: Which patterns have you mastered most thoroughly?
- Your goals: What do you want to demonstrate to future employers/collaborators?
Each option has been designed to showcase different aspects of recursive problem-solving. There is no "easy" or "hard" option; each requires mastery of different techniques.
Option A: Recursive Expression Evaluator with Variables
Problem Statement
Build a recursive expression evaluator that can parse and evaluate mathematical expressions with:
- Arithmetic operators: +, -, *, /, ^ (exponentiation)
- Operator precedence: Following standard mathematical rules
- Parentheses: For grouping subexpressions
- Variables: Named values that can be substituted
- Functions: Built-in functions like sqrt(), abs(), max(), min()
Example expressions:
"3 + 4 * 2" β 11
"(3 + 4) * 2" β 14
"2 ^ 3 ^ 2" β 512 (right-associative)
"x * 2 + y" β Depends on variable values
"max(3, min(5, 2))" β 3
"sqrt(x^2 + y^2)" β Euclidean distance
Why This Project Tests Recursion Mastery
This project requires:
- Recursive descent parsing: Breaking expressions into subexpressions
- Tree recursion: Expression trees with multiple children
- Mutual recursion: Different parsing functions calling each other
- Backtracking: Trying different parse interpretations
- Memoization: Caching subexpression results
Core Challenges You'll Solve
Challenge 1: Operator Precedence Without Explicit Parsing
The Problem: How do you evaluate 3 + 4 * 2 correctly without building a full parse tree first?
Recursive Insight: Process operators in precedence order, recursing for higher-precedence operations.
Challenge 2: Parentheses and Nested Expressions
The Problem: How do you handle ((3 + 4) * (5 - 2)) / 3?
Recursive Insight: When you encounter (, recursively evaluate the subexpression until ).
Challenge 3: Function Calls with Multiple Arguments
The Problem: How do you evaluate max(3, min(5, 2), 4)?
Recursive Insight: Each argument is itself an expression to be recursively evaluated.
Minimum Viable Product (MVP)
Your MVP must support:
# Basic arithmetic with precedence
evaluate("3 + 4 * 2") # β 11
# Parentheses
evaluate("(3 + 4) * 2") # β 14
# Variables
evaluate("x * 2 + 3", {"x": 5}) # β 13
# At least one function
evaluate("max(3, 5, 2)") # β 5
Suggested Architecture
Here's a starting point structure (you'll refine this through iterations):
class ExpressionEvaluator:
"""Recursive expression evaluator with variable support."""
def __init__(self, variables: dict[str, float] | None = None):
self.variables = variables or {}
def evaluate(self, expression: str) -> float:
"""Main entry point: parse and evaluate expression."""
tokens = self.tokenize(expression)
result, _ = self.parse_expression(tokens, 0)
return result
def tokenize(self, expression: str) -> list[str]:
"""Convert string to list of tokens."""
# Your implementation
pass
def parse_expression(self, tokens: list[str], pos: int) -> tuple[float, int]:
"""
Recursively parse and evaluate expression.
Returns (result, new_position).
"""
# Your implementation
pass
def parse_term(self, tokens: list[str], pos: int) -> tuple[float, int]:
"""Parse multiplication/division (higher precedence)."""
# Your implementation
pass
def parse_factor(self, tokens: list[str], pos: int) -> tuple[float, int]:
"""Parse numbers, variables, parentheses, functions."""
# Your implementation
pass
Iteration Roadmap
Iteration 0: Numbers and Addition Only
Goal: Get basic recursion working with the simplest case.
# Should work:
evaluate("3 + 4 + 5") # β 12
evaluate("10") # β 10
Key Learning: Establish the recursive pattern for left-to-right evaluation.
Iteration 1: Add Multiplication and Precedence
Goal: Handle operator precedence correctly.
# Should work:
evaluate("3 + 4 * 2") # β 11 (not 14!)
evaluate("2 * 3 + 4 * 5") # β 26
Key Learning: Introduce mutual recursion between parse_expression() and parse_term().
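A minimal sketch of that mutual recursion, assuming a flat token list such as ["3", "+", "4", "*", "2"] (the function names mirror the suggested architecture; everything else is illustrative, not the required implementation):

```python
# Sketch: mutual recursion gives * and / higher precedence than + and -.
def parse_expression(tokens: list[str], pos: int) -> tuple[float, int]:
    """Handle + and - (lowest precedence); defer to parse_term for * and /."""
    value, pos = parse_term(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("+", "-"):
        op = tokens[pos]
        right, pos = parse_term(tokens, pos + 1)
        value = value + right if op == "+" else value - right
    return value, pos

def parse_term(tokens: list[str], pos: int) -> tuple[float, int]:
    """Handle * and /; defer to parse_factor for numbers and parentheses."""
    value, pos = parse_factor(tokens, pos)
    while pos < len(tokens) and tokens[pos] in ("*", "/"):
        op = tokens[pos]
        right, pos = parse_factor(tokens, pos + 1)
        value = value * right if op == "*" else value / right
    return value, pos

def parse_factor(tokens: list[str], pos: int) -> tuple[float, int]:
    """Numbers and parenthesized subexpressions (the link back up the chain)."""
    if tokens[pos] == "(":
        value, pos = parse_expression(tokens, pos + 1)  # mutual recursion
        return value, pos + 1                           # skip the closing ")"
    return float(tokens[pos]), pos + 1

assert parse_expression(["3", "+", "4", "*", "2"], 0)[0] == 11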
Iteration 2: Add Parentheses
Goal: Handle nested subexpressions.
# Should work:
evaluate("(3 + 4) * 2") # β 14
evaluate("((2 + 3) * 4) + 1") # β 21
Key Learning: Recursive descent into parenthesized subexpressions.
Iteration 3: Add Variables
Goal: Support variable substitution.
# Should work:
evaluate("x + 2", {"x": 5}) # β 7
evaluate("x * y", {"x": 3, "y": 4}) # β 12
Key Learning: Variable lookup as a base case in recursion.
Iteration 4: Add Functions
Goal: Support function calls with multiple arguments.
# Should work:
evaluate("max(3, 5, 2)") # β 5
evaluate("sqrt(x^2 + y^2)", {"x": 3, "y": 4}) # β 5.0
Key Learning: Recursively evaluate each function argument.
Iteration 5: Add Exponentiation (Right-Associative)
Goal: Handle right-associative operators correctly.
# Should work:
evaluate("2 ^ 3 ^ 2") # β 512 (not 64!)
# Because: 2^(3^2) = 2^9 = 512
Key Learning: Right-associative recursion pattern.
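One way to get right associativity is to recurse on the right-hand operand instead of looping over it; a sketch, where parse_power is a hypothetical layer sitting between parse_term and parse_factor:

```python
def parse_power(tokens: list[str], pos: int) -> tuple[float, int]:
    """Right-associative ^: parse the base, then recurse for the entire right side."""
    base, pos = parse_factor(tokens, pos)
    if pos < len(tokens) and tokens[pos] == "^":
        exponent, pos = parse_power(tokens, pos + 1)  # recurse right-first
        return base ** exponent, pos
    return base, pos

# "2 ^ 3 ^ 2" becomes 2 ** (3 ** 2) == 512, not (2 ** 3) ** 2 == 64.
```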
Testing Strategy
Your test suite should include:
1. Base Case Tests
def test_base_cases():
assert evaluate("42") == 42
assert evaluate("3.14") == 3.14
assert evaluate("-5") == -5
2. Precedence Tests
def test_precedence():
assert evaluate("2 + 3 * 4") == 14
assert evaluate("2 * 3 + 4") == 10
assert evaluate("2 + 3 * 4 + 5") == 19
3. Parentheses Tests
def test_parentheses():
assert evaluate("(2 + 3) * 4") == 20
assert evaluate("2 * (3 + 4)") == 14
assert evaluate("((2 + 3) * 4) + 5") == 25
4. Variable Tests
def test_variables():
assert evaluate("x", {"x": 5}) == 5
assert evaluate("x + y", {"x": 3, "y": 4}) == 7
assert evaluate("2 * x + y", {"x": 3, "y": 4}) == 10
5. Function Tests
def test_functions():
assert evaluate("max(3, 5, 2)") == 5
assert evaluate("min(3, 5, 2)") == 2
assert evaluate("sqrt(16)") == 4.0
6. Complex Expression Tests
def test_complex():
assert evaluate("2 * (3 + 4) * 5") == 70
assert evaluate("max(2 * 3, 4 + 5)") == 9
assert evaluate("sqrt(x^2 + y^2)", {"x": 3, "y": 4}) == 5.0
7. Edge Case Tests
import pytest

def test_edge_cases():
assert evaluate("0") == 0
assert evaluate("1 + 0") == 1
assert evaluate("5 - 5") == 0
# Should raise errors:
with pytest.raises(ValueError):
evaluate("x") # Undefined variable
with pytest.raises(ValueError):
evaluate("1 / 0") # Division by zero
with pytest.raises(ValueError):
evaluate("(3 + 4") # Unmatched parenthesis
Performance Benchmarks
Compare your recursive evaluator against Python's eval():
import timeit
expressions = [
"3 + 4 * 2",
"(3 + 4) * (5 - 2)",
"2 ^ 3 ^ 2",
"max(3, min(5, 2), 4)",
]
for expr in expressions:
recursive_time = timeit.timeit(
lambda: evaluate(expr),
number=10000
)
eval_time = timeit.timeit(
lambda: eval(expr.replace('^', '**')),
number=10000
)
print(f"{expr:30} | Recursive: {recursive_time:.4f}s | eval(): {eval_time:.4f}s")
Complexity Analysis
Analyze your solution:
Time Complexity: O(n) where n is the number of tokens
- Each token is processed exactly once
- Recursive calls don't revisit tokens

Space Complexity: O(d) where d is the maximum nesting depth
- Call stack depth equals maximum parenthesis nesting
- Each recursive call stores constant data

Recurrence Relation:
T(n) = T(n-1) + O(1) for linear expressions
T(n) = 2T(n/2) + O(1) for balanced parentheses
Both solve to O(n).
Extension Ideas
Once your MVP works, consider adding:
- More operators: Modulo (%), integer division (//)
- More functions: sin(), cos(), log(), exp()
- Comparison operators: <, >, ==, returning 1.0 or 0.0
- Logical operators: and, or, not
- Conditional expressions: if(condition, true_value, false_value)
- Error messages: Detailed parse error reporting with position
- Optimization: Constant folding, common subexpression elimination
Deliverables Checklist
- [ ] Core evaluator supporting all MVP features
- [ ] At least 5 documented iterations showing progression
- [ ] Comprehensive test suite (30+ tests)
- [ ] Performance benchmarks comparing to eval()
- [ ] Complexity analysis with recurrence relations
- [ ] README explaining design decisions
- [ ] Example usage demonstrating all features
Option B: File System Analyzer with Recursive Statistics
Problem Statement
Build a recursive file system analyzer that traverses directory trees and computes comprehensive statistics:
- Size analysis: Total size, size by file type, largest files
- File counting: Total files, files by extension, empty files
- Directory depth: Maximum depth, average depth, deepest paths
- Pattern matching: Find files matching patterns, duplicate detection
- Time analysis: Oldest/newest files, files modified in date ranges
- Visualization: Generate tree representations, size distribution charts
Example output:
Analyzing: /home/user/projects/
Total size: 2.4 GB
Total files: 1,247
Total directories: 156
Maximum depth: 8 levels
Size by type:
.py: 450 MB (18.8%)
.js: 380 MB (15.8%)
.json: 120 MB (5.0%)
.md: 45 MB (1.9%)
Other: 1.4 GB (58.5%)
Largest files:
1. node_modules/package/dist/bundle.js (45 MB)
2. data/training_set.csv (38 MB)
3. .git/objects/pack/pack-abc123.pack (32 MB)
Deepest path (8 levels):
src/components/dashboard/widgets/charts/line/config/defaults.json
Why This Project Tests Recursion Mastery
This project requires:
- Tree recursion: Navigating arbitrary-depth directory structures
- Accumulator pattern: Collecting statistics across recursive calls
- Multiple recursive calls: Processing all subdirectories at each level
- Backtracking: Exploring all paths in the tree
- Memoization: Caching directory statistics for repeated queries
Core Challenges You'll Solve
Challenge 1: Arbitrary Depth Traversal
The Problem: How do you process a directory tree when you don't know how deep it goes?
Recursive Insight: Each directory is processed the same wayβlist its contents, recurse on subdirectories.
Challenge 2: Aggregating Statistics Across Branches
The Problem: How do you combine file counts and sizes from multiple subdirectories?
Recursive Insight: Each recursive call returns statistics; parent combines them.
Challenge 3: Handling Symbolic Links and Permissions
The Problem: What happens when you encounter a symlink loop or a directory you can't read?
Recursive Insight: Base cases for error conditions prevent infinite recursion.
Minimum Viable Product (MVP)
Your MVP must support:
# Basic directory analysis
stats = analyze_directory("/path/to/dir")
print(f"Total size: {stats.total_size} bytes")
print(f"Total files: {stats.file_count}")
print(f"Max depth: {stats.max_depth}")
# Size by extension
for ext, size in stats.size_by_extension.items():
print(f"{ext}: {size} bytes")
# Find largest files
for path, size in stats.largest_files(n=10):
print(f"{path}: {size} bytes")
# Find files matching pattern
matches = find_files("/path/to/dir", pattern="*.py")
for path in matches:
print(path)
Suggested Architecture
Here's a starting point structure (you'll refine this through iterations):
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Tuple
import os
@dataclass
class DirectoryStats:
"""Statistics for a directory and its contents."""
path: Path
total_size: int = 0
file_count: int = 0
dir_count: int = 0
max_depth: int = 0
size_by_extension: Dict[str, int] = field(default_factory=dict)
largest_files: List[Tuple[Path, int]] = field(default_factory=list)
def merge(self, other: 'DirectoryStats') -> None:
"""Merge statistics from a subdirectory."""
self.total_size += other.total_size
self.file_count += other.file_count
self.dir_count += other.dir_count
self.max_depth = max(self.max_depth, other.max_depth + 1)
# Merge extension sizes
for ext, size in other.size_by_extension.items():
self.size_by_extension[ext] = self.size_by_extension.get(ext, 0) + size
# Merge largest files
self.largest_files.extend(other.largest_files)
self.largest_files.sort(key=lambda x: x[1], reverse=True)
self.largest_files = self.largest_files[:100] # Keep top 100
class FileSystemAnalyzer:
"""Recursive file system analyzer."""
def __init__(self, follow_symlinks: bool = False):
self.follow_symlinks = follow_symlinks
self.visited_inodes = set() # Prevent symlink loops
def analyze_directory(self, path: Path) -> DirectoryStats:
"""
Recursively analyze directory and return statistics.
Base cases:
- Path doesn't exist
- Path is a file (not directory)
- Permission denied
- Symlink loop detected
Recursive case:
- Process each subdirectory
- Merge their statistics
"""
# Your implementation
pass
def find_files(self, path: Path, pattern: str) -> List[Path]:
"""
Recursively find all files matching pattern.
Returns list of matching file paths.
"""
# Your implementation
pass
def find_duplicates(self, path: Path) -> Dict[int, List[Path]]:
"""
Find duplicate files by size and content hash.
Returns dict mapping file hash to list of paths.
"""
# Your implementation
pass
Iteration Roadmap
Iteration 0: Single Directory (No Recursion)
Goal: Process files in one directory only.
# Should work:
stats = analyze_directory("/path/to/flat/dir")
print(stats.file_count) # Count of files in this directory only
print(stats.total_size) # Sum of file sizes in this directory only
Key Learning: Establish the base caseβprocessing a single directory's contents.
Iteration 1: Add Recursive Subdirectory Traversal
Goal: Recursively process all subdirectories.
# Should work:
stats = analyze_directory("/path/to/nested/dir")
print(stats.file_count) # Count of ALL files in tree
print(stats.dir_count) # Count of ALL subdirectories
Key Learning: Recursive descent into subdirectories, merging statistics.
Expected Failure Mode: RecursionError on very deep directory trees or symlink loops.
Diagnostic Analysis:
RecursionError: maximum recursion depth exceeded while calling a Python object
File "analyzer.py", line 45, in analyze_directory
subdir_stats = self.analyze_directory(subdir_path)
File "analyzer.py", line 45, in analyze_directory
subdir_stats = self.analyze_directory(subdir_path)
[Previous line repeated 996 more times]
Root Cause: No depth limit, no symlink loop detection.
Solution: Add depth tracking and visited inode tracking.
Iteration 2: Add Depth Tracking and Loop Prevention
Goal: Handle deep directories and symlink loops safely.
# Should work:
stats = analyze_directory("/path/with/symlinks", max_depth=10)
print(stats.max_depth) # Maximum depth reached
# Should not crash:
stats = analyze_directory("/path/with/symlink/loop")
# Detects loop, skips revisited directories
Key Learning: Base cases for depth limits and loop detection.
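A sketch of how those two base cases might sit at the top of analyze_directory, assuming the DirectoryStats/FileSystemAnalyzer skeleton above (the depth and max_depth parameters are illustrative additions, and error handling around iterdir() is omitted for brevity):

```python
from pathlib import Path

def analyze_directory(self, path: Path, depth: int = 0, max_depth: int = 50) -> DirectoryStats:
    stats = DirectoryStats(path=path)

    # Base case 1: depth limit reached -- stop descending.
    if depth > max_depth:
        return stats

    # Base case 2: symlink loop -- skip anything we've already visited.
    try:
        inode = path.stat().st_ino
    except OSError:                      # broken symlink, permission denied, ...
        return stats
    if inode in self.visited_inodes:
        return stats
    self.visited_inodes.add(inode)

    # Recursive case: count entries and merge stats from one level deeper.
    for entry in path.iterdir():
        if entry.is_dir():
            stats.dir_count += 1
            stats.merge(self.analyze_directory(entry, depth + 1, max_depth))
        elif entry.is_file():
            stats.file_count += 1
            stats.total_size += entry.stat().st_size
    return stats
```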
Iteration 3: Add Extension-Based Statistics
Goal: Track file sizes by extension.
# Should work:
stats = analyze_directory("/path/to/project")
for ext, size in stats.size_by_extension.items():
print(f"{ext}: {size} bytes")
# Output:
# .py: 1234567 bytes
# .js: 987654 bytes
# .json: 456789 bytes
Key Learning: Accumulating categorized data across recursive calls.
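A small helper sketch for that accumulation, assuming the DirectoryStats dataclass above (the helper name is illustrative); merge() then folds each child's dictionary into its parent as the recursion unwinds:

```python
from pathlib import Path

def add_file_to_stats(stats: DirectoryStats, entry: Path) -> None:
    """Categorize one file by extension while visiting it."""
    size = entry.stat().st_size
    ext = entry.suffix.lower() or "<no extension>"
    stats.total_size += size
    stats.file_count += 1
    stats.size_by_extension[ext] = stats.size_by_extension.get(ext, 0) + size
```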
Iteration 4: Add Pattern Matching
Goal: Find files matching glob patterns.
# Should work:
python_files = find_files("/path/to/project", "*.py")
test_files = find_files("/path/to/project", "**/test_*.py")
config_files = find_files("/path/to/project", "**/*.{json,yaml}")
Key Learning: Recursive search with filtering.
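A sketch of the recursive search, assuming simple name-level patterns like "*.py" (the **/ forms shown above would need extra handling):

```python
import fnmatch
from pathlib import Path
from typing import List

def find_files(path: Path, pattern: str) -> List[Path]:
    """Recursively collect files whose names match a glob-style pattern."""
    matches: List[Path] = []
    for entry in path.iterdir():
        if entry.is_dir():
            matches.extend(find_files(entry, pattern))  # recurse into subdirectory
        elif fnmatch.fnmatch(entry.name, pattern):
            matches.append(entry)
    return matches
```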
Iteration 5: Add Duplicate Detection
Goal: Find duplicate files by content.
# Should work:
duplicates = find_duplicates("/path/to/project")
for file_hash, paths in duplicates.items():
if len(paths) > 1:
print(f"Duplicates found ({len(paths)} copies):")
for path in paths:
print(f" {path}")
Key Learning: Recursive collection with post-processing.
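A sketch of that collect-then-group shape; note it keys on a hex digest rather than the int in the earlier skeleton, and it hashes every file (a real version would prefilter by size first):

```python
import hashlib
from collections import defaultdict
from pathlib import Path
from typing import Dict, List

def find_duplicates(path: Path) -> Dict[str, List[Path]]:
    """Recursively collect files, then group them by content hash."""
    by_hash: Dict[str, List[Path]] = defaultdict(list)

    def walk(directory: Path) -> None:
        for entry in directory.iterdir():
            if entry.is_dir():
                walk(entry)                               # recursive collection
            elif entry.is_file():
                digest = hashlib.sha256(entry.read_bytes()).hexdigest()
                by_hash[digest].append(entry)

    walk(path)
    # Post-processing: keep only hashes shared by more than one file.
    return {h: paths for h, paths in by_hash.items() if len(paths) > 1}
```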
Testing Strategy
Your test suite should include:
1. Base Case Tests
Create test directory structures:
import tempfile
import os
def create_test_structure():
"""Create a known directory structure for testing."""
tmpdir = tempfile.mkdtemp()
# Create structure:
# tmpdir/
# file1.txt (100 bytes)
# file2.py (200 bytes)
# subdir1/
# file3.txt (150 bytes)
# subdir2/
# file4.py (250 bytes)
os.makedirs(os.path.join(tmpdir, "subdir1", "subdir2"))
with open(os.path.join(tmpdir, "file1.txt"), "w") as f:
f.write("x" * 100)
with open(os.path.join(tmpdir, "file2.py"), "w") as f:
f.write("x" * 200)
with open(os.path.join(tmpdir, "subdir1", "file3.txt"), "w") as f:
f.write("x" * 150)
with open(os.path.join(tmpdir, "subdir1", "subdir2", "file4.py"), "w") as f:
f.write("x" * 250)
return tmpdir
def test_base_cases():
tmpdir = create_test_structure()
stats = analyze_directory(tmpdir)
assert stats.file_count == 4
assert stats.dir_count == 2
assert stats.total_size == 700 # 100 + 200 + 150 + 250
assert stats.max_depth == 2
2. Extension Statistics Tests
def test_extension_stats():
tmpdir = create_test_structure()
stats = analyze_directory(tmpdir)
assert stats.size_by_extension[".txt"] == 250 # file1 + file3
assert stats.size_by_extension[".py"] == 450 # file2 + file4
3. Pattern Matching Tests
def test_pattern_matching():
tmpdir = create_test_structure()
txt_files = find_files(tmpdir, "*.txt")
assert len(txt_files) == 2
py_files = find_files(tmpdir, "**/*.py")
assert len(py_files) == 2
4. Symlink Loop Tests
def test_symlink_loop():
tmpdir = tempfile.mkdtemp()
# Create symlink loop:
# tmpdir/subdir -> tmpdir
subdir = os.path.join(tmpdir, "subdir")
os.symlink(tmpdir, subdir)
# Should not crash
stats = analyze_directory(tmpdir, follow_symlinks=True)
# Should detect loop and skip
assert stats.dir_count == 1 # Symlinked directory counted once, not recursed into
5. Permission Error Tests
def test_permission_errors():
tmpdir = tempfile.mkdtemp()
# Create unreadable directory
unreadable = os.path.join(tmpdir, "unreadable")
os.makedirs(unreadable)
os.chmod(unreadable, 0o000)
# Should not crash
stats = analyze_directory(tmpdir)
# Should skip unreadable directory
assert stats.dir_count == 0
# Cleanup
os.chmod(unreadable, 0o755)
6. Large Directory Tests
def test_large_directory():
"""Test performance on large directory structures."""
tmpdir = create_large_test_structure(
depth=5,
files_per_dir=10,
subdirs_per_dir=3
)
import time
start = time.time()
stats = analyze_directory(tmpdir)
elapsed = time.time() - start
print(f"Analyzed {stats.file_count} files in {elapsed:.2f}s")
assert elapsed < 5.0 # Should complete in reasonable time
Performance Benchmarks
Compare different approaches:
import time
def benchmark_approaches(test_dir):
"""Compare recursive vs iterative approaches."""
# Recursive approach
start = time.time()
recursive_stats = analyze_directory_recursive(test_dir)
recursive_time = time.time() - start
# Iterative approach (using os.walk)
start = time.time()
iterative_stats = analyze_directory_iterative(test_dir)
iterative_time = time.time() - start
print(f"Recursive: {recursive_time:.4f}s")
print(f"Iterative: {iterative_time:.4f}s")
print(f"Speedup: {recursive_time / iterative_time:.2f}x")
# Verify same results
assert recursive_stats.file_count == iterative_stats.file_count
assert recursive_stats.total_size == iterative_stats.total_size
Complexity Analysis
Analyze your solution:
Time Complexity: O(n) where n is the total number of files and directories
- Each file/directory is visited exactly once
- Constant work per item (stat, categorize)

Space Complexity: O(d) where d is the maximum directory depth
- Call stack depth equals maximum nesting level
- Each recursive call stores directory statistics

Recurrence Relation:
T(n) = sum of T(n_i) over subdirectories + O(m), where m is the number of entries in the current directory
Each entry contributes O(1) work to exactly one directory, so this solves to T(n) = O(n).
(If per-directory merge work grows with subtree size, a balanced tree degrades toward O(n log n).)

Call Stack Depth:
Worst case: O(n) for a linear directory chain
Average case: O(log n) for a balanced tree
Practical limit: Python's recursion limit (default 1000)
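If a legitimate tree is deeper than that limit, the limit can be raised, though an explicit max_depth base case is usually the safer fix; a quick sketch:

```python
import sys

print(sys.getrecursionlimit())   # typically 1000

# Each extra frame costs real stack space, so raise the limit sparingly and
# prefer a max_depth base case in analyze_directory when possible.
sys.setrecursionlimit(5000)
```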
Visualization Examples
Generate visual representations:
1. Directory Tree
def print_tree(path: Path, prefix: str = "", max_depth: int = 3):
"""Recursively print directory tree structure."""
if max_depth == 0:
return
entries = sorted(path.iterdir())
for i, entry in enumerate(entries):
is_last = (i == len(entries) - 1)
connector = "βββ " if is_last else "βββ "
print(f"{prefix}{connector}{entry.name}")
if entry.is_dir():
extension = " " if is_last else "β "
print_tree(entry, prefix + extension, max_depth - 1)
# Output:
# project/
# ├── src/
# │   ├── main.py
# │   └── utils.py
# ├── tests/
# │   └── test_main.py
# └── README.md
2. Size Distribution
def print_size_distribution(stats: DirectoryStats):
"""Print size distribution by extension."""
total = stats.total_size
sorted_exts = sorted(
stats.size_by_extension.items(),
key=lambda x: x[1],
reverse=True
)
for ext, size in sorted_exts[:10]:
percentage = (size / total) * 100
bar_length = int(percentage / 2)
bar = "β" * bar_length
print(f"{ext:10} {bar:50} {percentage:5.1f}% ({size:,} bytes)")
# Output:
# .py    ████████████████████ 18.8% (450,000 bytes)
# .js    ███████████████ 15.8% (380,000 bytes)
# .json  ██ 5.0% (120,000 bytes)
Extension Ideas
Once your MVP works, consider adding:
- Time-based analysis: Files modified in last N days, oldest files
- Content analysis: Line counts for code files, comment ratios
- Git integration: Ignore files in .gitignore, show uncommitted changes
- Compression analysis: Estimate compression ratios, find compressible files
- Security scanning: Find files with dangerous permissions, exposed secrets
- Comparison mode: Compare two directory trees, show differences
- Export formats: JSON, CSV, HTML reports
- Interactive mode: TUI for exploring directory statistics
Deliverables Checklist
- [ ] Core analyzer supporting all MVP features
- [ ] At least 5 documented iterations showing progression
- [ ] Comprehensive test suite with synthetic directory structures
- [ ] Performance benchmarks comparing recursive vs iterative
- [ ] Complexity analysis with call stack depth analysis
- [ ] Visualization functions (tree, size distribution)
- [ ] README with usage examples and design decisions
- [ ] Error handling for permissions, symlinks, edge cases
Option C: Recursive Board Game Solver (Tic-Tac-Toe and Connect Four with Minimax)
Problem Statement
Build a recursive game-playing AI using the minimax algorithm with alpha-beta pruning. Your solver should:
- Play perfectly: Make optimal moves assuming perfect opponent play
- Evaluate positions: Assign scores to board states
- Search game trees: Explore possible move sequences recursively
- Optimize search: Use alpha-beta pruning to skip irrelevant branches
- Handle complexity: Work with games of different complexity levels
Supported games:
- Tic-Tac-Toe: 3×3 grid, simple game tree (~9! = 362,880 positions)
- Connect Four: 7×6 grid, complex game tree (~10^13 positions)
Example interaction:
Current board:
X | O | X
---------
O | X |
---------
  |   | O
AI thinking... (explored 1,247 positions)
AI plays: position (2, 1)
X | O | X
---------
O | X |
---------
  | X | O
AI evaluation: +8 (AI winning)
Best move sequence: (2,1) → (0,2) → (2,0) → AI wins
Why This Project Tests Recursion Mastery
This project requires:
- Tree recursion: Exploring all possible move sequences
- Minimax pattern: Alternating between maximizing and minimizing players
- Backtracking: Trying moves, recursing, then undoing moves
- Pruning: Skipping branches that can't affect the outcome
- Memoization: Caching board evaluations (transposition tables)
Core Challenges You'll Solve
Challenge 1: Exponential Search Space
The Problem: Tic-Tac-Toe has ~362,880 positions. Connect Four has ~10^13. How do you search efficiently?
Recursive Insight: Minimax explores the tree systematically; alpha-beta pruning eliminates irrelevant branches.
Challenge 2: Alternating Perspectives
The Problem: How do you model that each player wants opposite outcomes?
Recursive Insight: Recursive calls alternate between maximizing (AI) and minimizing (opponent) perspectives.
Challenge 3: Depth-Limited Search
The Problem: Connect Four's full game tree is too large to search completely.
Recursive Insight: Limit recursion depth and use heuristic evaluation for non-terminal positions.
Minimum Viable Product (MVP)
Your MVP must support:
# Create game
game = TicTacToe()
# AI makes optimal move
best_move = game.find_best_move(player='X')
game.make_move(best_move, player='X')
# Evaluate position
score = game.evaluate()
print(f"Position score: {score}")
# Get move explanation
explanation = game.explain_move(best_move)
print(f"Why this move: {explanation}")
# Play full game
game.play_ai_vs_ai()
Suggested Architecture
Here's a starting point structure (you'll refine this through iterations):
from dataclasses import dataclass
from typing import List, Tuple, Optional
from enum import Enum
class Player(Enum):
X = 1
O = -1
EMPTY = 0
@dataclass
class GameState:
"""Represents a board position."""
board: List[List[Player]]
current_player: Player
def is_terminal(self) -> bool:
"""Check if game is over."""
pass
def get_winner(self) -> Optional[Player]:
"""Return winner if game is over, else None."""
pass
def get_legal_moves(self) -> List[Tuple[int, int]]:
"""Return list of legal moves."""
pass
def make_move(self, move: Tuple[int, int]) -> 'GameState':
"""Return new state after making move."""
pass
class MinimaxSolver:
"""Recursive minimax solver with alpha-beta pruning."""
def __init__(self, max_depth: Optional[int] = None):
self.max_depth = max_depth
self.nodes_explored = 0
self.cache = {} # Transposition table
def find_best_move(self, state: GameState) -> Tuple[int, int]:
"""
Find optimal move using minimax algorithm.
Returns (row, col) of best move.
"""
self.nodes_explored = 0
best_score = float('-inf')
best_move = None
for move in state.get_legal_moves():
new_state = state.make_move(move)
score = self.minimax(
new_state,
depth=0,
alpha=float('-inf'),
beta=float('inf'),
maximizing=False
)
if score > best_score:
best_score = score
best_move = move
return best_move
def minimax(
self,
state: GameState,
depth: int,
alpha: float,
beta: float,
maximizing: bool
) -> float:
"""
Recursive minimax with alpha-beta pruning.
Base cases:
- Game is over (terminal state)
- Maximum depth reached
- Position in cache
Recursive case:
- Try all legal moves
- Recurse on resulting states
- Return best score for current player
"""
self.nodes_explored += 1
# Your implementation
pass
def evaluate(self, state: GameState) -> float:
"""
Evaluate non-terminal position heuristically.
Returns score from maximizing player's perspective.
"""
# Your implementation
pass
class TicTacToe:
"""Tic-Tac-Toe game with AI solver."""
def __init__(self):
self.state = GameState(
board=[[Player.EMPTY] * 3 for _ in range(3)],
current_player=Player.X
)
self.solver = MinimaxSolver()
def play_ai_vs_ai(self):
"""Play complete game with AI vs AI."""
while not self.state.is_terminal():
self.print_board()
move = self.solver.find_best_move(self.state)
print(f"{self.state.current_player.name} plays: {move}")
self.state = self.state.make_move(move)
self.print_board()
winner = self.state.get_winner()
if winner:
print(f"{winner.name} wins!")
else:
print("Draw!")
Iteration Roadmap
Iteration 0: Basic Minimax (No Pruning)
Goal: Implement naive minimax that explores entire game tree.
# Should work:
game = TicTacToe()
best_move = game.find_best_move()
print(f"Best move: {best_move}")
print(f"Nodes explored: {game.solver.nodes_explored}")
# Expected output:
# Best move: (1, 1) # Center is optimal first move
# Nodes explored: 549,946 # Full tree exploration
Key Learning: Establish the recursive minimax pattern.
Expected Performance: Slow for Tic-Tac-Toe, unusable for Connect Four.
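A minimal sketch of that naive recursive core, assuming the GameState/Player interface above and scoring +10/-10/0 from X's (the maximizer's) perspective:

```python
def minimax(self, state: GameState, maximizing: bool) -> float:
    """Naive minimax: explore every continuation, no pruning, no depth limit."""
    self.nodes_explored += 1

    # Base case: game over -- score from the maximizer's perspective.
    if state.is_terminal():
        winner = state.get_winner()
        if winner is None:
            return 0.0                                   # draw
        return 10.0 if winner == Player.X else -10.0

    # Recursive case: best child score for whoever moves here.
    scores = [
        self.minimax(state.make_move(move), not maximizing)
        for move in state.get_legal_moves()
    ]
    return max(scores) if maximizing else min(scores)
```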
Iteration 1: Add Alpha-Beta Pruning
Goal: Optimize search by pruning irrelevant branches.
# Should work:
game = TicTacToe()
best_move = game.find_best_move()
print(f"Nodes explored: {game.solver.nodes_explored}")
# Expected output:
# Nodes explored: 18,297 # ~30x reduction!
Key Learning: Pruning dramatically reduces search space.
Before/After Comparison:
Without Alpha-Beta Pruning:
Exploring move (0, 0)...
  Exploring opponent move (0, 1)...
    Exploring move (0, 2)...
[... explores all 362,880 positions ...]
Nodes explored: 549,946
Time: 2.3 seconds
With Alpha-Beta Pruning:
Exploring move (0, 0)...
  Exploring opponent move (0, 1)...
    Exploring move (0, 2)... Score: -10 (loss)
  [Pruning remaining moves - can't improve]
Nodes explored: 18,297
Time: 0.08 seconds
Speedup: 28.75x
Diagnostic Analysis:
The pruning works because:
1. We found a move leading to a loss (score -10)
2. We're minimizing (opponent's turn)
3. Any worse move (score < -10) won't be chosen by the opponent
4. Therefore, we can skip exploring those branches
Call Stack Visualization (simplified):
minimax(state, depth=0, alpha=-∞, beta=+∞, max=True)
├── Try move (0,0)
│   minimax(state', depth=1, alpha=-∞, beta=+∞, max=False)
│   ├── Try move (0,1)
│   │   minimax(state'', depth=2, alpha=-∞, beta=+∞, max=True)
│   │   ├── Try move (0,2)
│   │   │   minimax(state''', depth=3, alpha=-∞, beta=+∞, max=False)
│   │   │   └── Score: -10 (opponent wins)
│   │   └── Update beta = -10
│   ├── Try move (1,0)
│   │   └── alpha=-∞ >= beta=-10? No, continue
│   ├── Try move (1,1)
│   │   └── alpha=-∞ >= beta=-10? No, continue
│   ├── [... more moves ...]
│   └── Return best score for minimizer
├── Update alpha
├── Try move (0,1)
│   └── alpha >= beta? Check for pruning
└── [... continues ...]
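A sketch of how the pruning check attaches to that recursion (same interface assumptions as above; evaluate_terminal is the terminal-scoring helper used in later iterations):

```python
def minimax(self, state: GameState, depth: int, alpha: float, beta: float,
            maximizing: bool) -> float:
    self.nodes_explored += 1
    if state.is_terminal():
        return self.evaluate_terminal(state)

    if maximizing:
        best = float('-inf')
        for move in state.get_legal_moves():
            best = max(best, self.minimax(state.make_move(move),
                                          depth + 1, alpha, beta, False))
            alpha = max(alpha, best)
            if alpha >= beta:        # minimizer already has a better option elsewhere
                break                # prune the remaining sibling moves
        return best

    best = float('inf')
    for move in state.get_legal_moves():
        best = min(best, self.minimax(state.make_move(move),
                                      depth + 1, alpha, beta, True))
        beta = min(beta, best)
        if alpha >= beta:
            break
    return best
```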
Iteration 2: Add Move Ordering
Goal: Explore better moves first to maximize pruning.
# Should work:
game = TicTacToe()
best_move = game.find_best_move()
print(f"Nodes explored: {game.solver.nodes_explored}")
# Expected output:
# Nodes explored: 7,842 # Further reduction!
Key Learning: Move ordering dramatically improves pruning effectiveness.
Move Ordering Heuristics:
1. Center moves first: Center is often strongest in Tic-Tac-Toe
2. Winning moves first: Check for immediate wins
3. Blocking moves second: Check for opponent wins to block
4. Corner moves before edges: Corners are generally stronger
Before/After Comparison:
Without Move Ordering (random order):
Trying moves in order: (2,1), (0,0), (1,2), (0,1), ...
Nodes explored: 18,297
Pruning efficiency: 30%

With Move Ordering (strategic order):
Trying moves in order: (1,1), (0,0), (0,2), (2,0), (2,2), ...
                       ^center ^------- corners -------^ (edges follow)
Nodes explored: 7,842
Pruning efficiency: 65%
Improvement: 2.3x fewer nodes
Why This Works:
When we explore good moves first:
1. We find strong positions early
2. This sets tight alpha/beta bounds
3. Weak moves get pruned immediately
4. We skip exploring obviously bad branches
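One way to order Tic-Tac-Toe moves before the loop (a sketch; the priority values are illustrative heuristics, not tuned constants):

```python
from typing import List, Tuple

def order_moves(moves: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Center first, then corners, then edges, so strong moves set tight bounds early."""
    def priority(move: Tuple[int, int]) -> int:
        row, col = move
        if (row, col) == (1, 1):
            return 0          # center
        if row in (0, 2) and col in (0, 2):
            return 1          # corner
        return 2              # edge

    return sorted(moves, key=priority)

# In minimax: for move in order_moves(state.get_legal_moves()): ...
```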
Iteration 3: Add Transposition Table (Memoization)
Goal: Cache board evaluations to avoid recomputing.
# Should work:
game = TicTacToe()
best_move = game.find_best_move()
print(f"Nodes explored: {game.solver.nodes_explored}")
print(f"Cache hits: {game.solver.cache_hits}")
# Expected output:
# Nodes explored: 7,842
# Cache hits: 3,421 # ~44% of lookups hit cache
Key Learning: Many board positions are reached via different move orders.
Transposition Example:
Same Position, Different Paths:
Path 1: X plays (0,0) → O plays (1,1) → X plays (0,1)
Path 2: X plays (0,1) → O plays (1,1) → X plays (0,0)
Both lead to:
X | X |
  | O |
  |   |
Without cache: Evaluate this position twice
With cache: Evaluate once, reuse result
Cache Key Design:
def board_to_key(board: List[List[Player]]) -> str:
"""Convert board to hashable cache key."""
return ''.join(
str(cell.value) for row in board for cell in row
)
# Example:
# Board: Cache key:
# X | O | "1-10000000"
# ----- (X=1, O=-1, EMPTY=0)
# | |
# -----
# | |
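A sketch of the cache check as a base case, shown on the unpruned minimax for clarity (mixing an exact-value cache with alpha-beta needs more care, since scores computed under narrowed alpha/beta windows are bounds rather than exact values):

```python
def minimax_cached(self, state: GameState, maximizing: bool) -> float:
    """Naive minimax plus a transposition table keyed on (board, side to move)."""
    key = (board_to_key(state.board), maximizing)

    # Base case: this position was already evaluated via a different move order.
    if key in self.cache:
        self.cache_hits += 1
        return self.cache[key]

    self.nodes_explored += 1
    if state.is_terminal():
        score = self.evaluate_terminal(state)
    else:
        children = [
            self.minimax_cached(state.make_move(m), not maximizing)
            for m in state.get_legal_moves()
        ]
        score = max(children) if maximizing else min(children)

    self.cache[key] = score
    return score
```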
Iteration 4: Add Depth-Limited Search
Goal: Support larger games by limiting search depth.
# Should work:
game = ConnectFour()
best_move = game.find_best_move(max_depth=6)
print(f"Nodes explored: {game.solver.nodes_explored}")
# Expected output:
# Nodes explored: 45,231 # Manageable for depth 6
# Evaluation: Heuristic (non-terminal)
Key Learning: Depth limits make complex games tractable.
Depth-Limited Evaluation:
def minimax(self, state, depth, alpha, beta, maximizing):
# Base case 1: Terminal state
if state.is_terminal():
return self.evaluate_terminal(state)
# Base case 2: Depth limit reached
if depth >= self.max_depth:
return self.evaluate_heuristic(state) # ← Heuristic evaluation at the depth limit
# Recursive case: Continue search
# ...
Heuristic Evaluation for Connect Four:
def evaluate_heuristic(self, state: GameState) -> float:
"""
Evaluate non-terminal Connect Four position.
Heuristics:
- Count potential winning lines (3-in-a-row with empty 4th)
- Penalize opponent's potential wins
- Favor center columns
- Reward connected pieces
"""
score = 0
# Count 3-in-a-row with empty 4th
for line in self.get_all_lines(state.board):
if self.count_pieces(line, Player.X) == 3 and \
self.count_pieces(line, Player.EMPTY) == 1:
score += 5
if self.count_pieces(line, Player.O) == 3 and \
self.count_pieces(line, Player.EMPTY) == 1:
score -= 5
# Favor center columns
# Board is a list of rows, so collect the middle cell of each row to form the center column
center_index = len(state.board[0]) // 2
center_col = [row[center_index] for row in state.board]
score += self.count_pieces(center_col, Player.X) * 3
score -= self.count_pieces(center_col, Player.O) * 3
return score
Iteration 5: Add Iterative Deepening
Goal: Progressively deepen search to find best move within time limit.
# Should work:
game = ConnectFour()
best_move = game.find_best_move(time_limit=5.0) # 5 seconds
print(f"Search depth reached: {game.solver.depth_reached}")
print(f"Nodes explored: {game.solver.nodes_explored}")
# Expected output:
# Search depth reached: 8
# Nodes explored: 234,567
# Time: 4.87 seconds
Key Learning: Iterative deepening provides anytime algorithm behavior.
Iterative Deepening Pattern:
def find_best_move_iterative(self, state: GameState, time_limit: float):
"""
Iteratively deepen search until time limit.
Depth 1: Quick evaluation
Depth 2: Better evaluation
Depth 3: Even better
...
Until time runs out
"""
import time
start_time = time.time()
best_move = None
depth = 1
while time.time() - start_time < time_limit:
self.max_depth = depth
current_best = self.find_best_move(state)
if current_best:
best_move = current_best
depth += 1
# Stop if we searched to terminal depth
if self.fully_searched:
break
return best_move
Testing Strategy
Your test suite should include:
1. Correctness Tests
def test_perfect_play():
"""AI should never lose when playing optimally."""
game = TicTacToe()
# Play 100 games AI vs AI
results = {"X": 0, "O": 0, "Draw": 0}
for _ in range(100):
game.reset()
winner = game.play_ai_vs_ai()
results[winner] += 1
# With perfect play, all games should be draws
assert results["Draw"] == 100
2. Pruning Effectiveness Tests
def test_pruning_effectiveness():
"""Alpha-beta should explore far fewer nodes."""
game = TicTacToe()
# Without pruning
solver_no_prune = MinimaxSolver(use_pruning=False)
solver_no_prune.find_best_move(game.state)
nodes_no_prune = solver_no_prune.nodes_explored
# With pruning
solver_prune = MinimaxSolver(use_pruning=True)
solver_prune.find_best_move(game.state)
nodes_prune = solver_prune.nodes_explored
# Should explore at least 10x fewer nodes
assert nodes_prune < nodes_no_prune / 10
3. Cache Effectiveness Tests
def test_cache_effectiveness():
"""Transposition table should reduce redundant computation."""
game = TicTacToe()
# With cache
solver_cache = MinimaxSolver(use_cache=True)
solver_cache.find_best_move(game.state)
nodes_cache = solver_cache.nodes_explored
hits_cache = solver_cache.cache_hits
# Cache hit rate should be > 30%
hit_rate = hits_cache / nodes_cache
assert hit_rate > 0.3
4. Depth Limit Tests
def test_depth_limit():
"""Depth limit should control search size."""
game = ConnectFour()
nodes_by_depth = {}
for depth in [2, 4, 6, 8]:
solver = MinimaxSolver(max_depth=depth)
solver.find_best_move(game.state)
nodes_by_depth[depth] = solver.nodes_explored
# Nodes should increase with depth
assert nodes_by_depth[2] < nodes_by_depth[4]
assert nodes_by_depth[4] < nodes_by_depth[6]
assert nodes_by_depth[6] < nodes_by_depth[8]
5. Known Position Tests
def test_known_positions():
"""Test against known optimal moves."""
game = TicTacToe()
# Empty board: center is optimal
best_move = game.find_best_move()
assert best_move == (1, 1)
# Winning move available
game.state.board = [
[Player.X, Player.X, Player.EMPTY],
[Player.O, Player.O, Player.EMPTY],
[Player.EMPTY, Player.EMPTY, Player.EMPTY]
]
best_move = game.find_best_move()
assert best_move == (0, 2) # Win immediately
# Blocking move required
game.state.board = [
[Player.X, Player.EMPTY, Player.EMPTY],
[Player.O, Player.O, Player.EMPTY],
[Player.EMPTY, Player.EMPTY, Player.EMPTY]
]
best_move = game.find_best_move()
assert best_move == (1, 2) # Block opponent win
Performance Benchmarks
Compare different optimizations:
def benchmark_optimizations():
"""Compare performance of different optimization techniques."""
game = TicTacToe()
configs = [
("Naive minimax", {"pruning": False, "cache": False, "ordering": False}),
("+ Alpha-beta", {"pruning": True, "cache": False, "ordering": False}),
("+ Move ordering", {"pruning": True, "cache": False, "ordering": True}),
("+ Transposition", {"pruning": True, "cache": True, "ordering": True}),
]
results = []
for name, config in configs:
solver = MinimaxSolver(**config)
start = time.time()
solver.find_best_move(game.state)
elapsed = time.time() - start
results.append({
"name": name,
"time": elapsed,
"nodes": solver.nodes_explored,
"cache_hits": solver.cache_hits if config["cache"] else 0
})
# Print comparison table
print(f"{'Configuration':<20} {'Time (s)':<10} {'Nodes':<10} {'Cache Hits':<12}")
print("-" * 60)
for r in results:
print(f"{r['name']:<20} {r['time']:<10.4f} {r['nodes']:<10} {r['cache_hits']:<12}")
# Expected output:
# Configuration Time (s) Nodes Cache Hits
# ------------------------------------------------------------
# Naive minimax 2.3456 549946 0
# + Alpha-beta 0.0823 18297 0
# + Move ordering 0.0354 7842 0
# + Transposition 0.0198 7842 3421
Complexity Analysis
Analyze your solution:
Time Complexity (without pruning): O(b^d)
- b = branching factor (average legal moves per position)
- d = depth of game tree
- Tic-Tac-Toe: b ≈ 5, d ≈ 9 → ~2 million nodes
- Connect Four: b ≈ 7, d ≈ 42 → ~10^35 nodes (intractable!)

Time Complexity (with alpha-beta pruning): O(b^(d/2)) in the best case
- Effective branching factor reduced by roughly a square root
- Tic-Tac-Toe: ~18,000 nodes (30x reduction)
- Connect Four: still large, requires depth limiting

Space Complexity: O(d)
- Call stack depth equals search depth
- Each recursive call stores board state and alpha/beta values
- Transposition table adds O(n) space where n is the number of unique positions
Recurrence Relation:
T(d) = b * T(d-1) + O(1) without pruning
T(d) = sqrt(b) * T(d-1) + O(1) with optimal pruning
Solves to:
T(d) = O(b^d) without pruning
T(d) = O(b^(d/2)) with pruning
Call Stack Depth:
Maximum depth: d (depth of game tree)
Tic-Tac-Toe: d ≤ 9
Connect Four: d ≤ 42
With depth limiting: d = max_depth parameter
Visualization Examples
Generate visual representations:
1. Game Tree Visualization
def visualize_game_tree(state: GameState, depth: int = 3):
"""Print game tree up to specified depth."""
def print_tree(state, depth, prefix="", is_last=True):
if depth == 0:
return
connector = "βββ " if is_last else "βββ "
score = evaluate(state)
print(f"{prefix}{connector}Score: {score}")
moves = state.get_legal_moves()
for i, move in enumerate(moves):
new_state = state.make_move(move)
extension = " " if is_last else "β "
print_tree(new_state, depth - 1, prefix + extension, i == len(moves) - 1)
print_tree(state, depth)
# Output:
# Score: 0
# ├── Score: -10 (opponent wins)
# ├── Score: 0 (draw)
# │   ├── Score: -10
# │   └── Score: 0
# └── Score: +10 (we win)
2. Search Progress Visualization
def visualize_search_progress(solver: MinimaxSolver):
"""Show search progress in real-time."""
import sys
def progress_callback(depth, nodes, best_score):
sys.stdout.write(f"\rDepth: {depth} | Nodes: {nodes:,} | Best: {best_score:+.1f}")
sys.stdout.flush()
solver.set_progress_callback(progress_callback)
solver.find_best_move(state)
print() # Newline after progress
# Output:
# Depth: 6 | Nodes: 45,231 | Best: +3.5
Extension Ideas
Once your MVP works, consider adding:
- More games: Checkers, Othello, Chess (with piece-square tables)
- Opening book: Pre-computed optimal moves for early game
- Endgame tablebase: Perfect play for positions with few pieces
- Monte Carlo Tree Search: Alternative to minimax for very large games
- Neural network evaluation: Learn evaluation function from data
- Parallel search: Multi-threaded tree exploration
- Interactive GUI: Visual board with click-to-move interface
- Game analysis: Annotate games with move quality scores
Deliverables Checklist
- [ ] Core minimax solver with alpha-beta pruning
- [ ] At least 5 documented iterations showing progression
- [ ] Support for both Tic-Tac-Toe and Connect Four
- [ ] Comprehensive test suite (correctness, performance, known positions)
- [ ] Performance benchmarks comparing optimizations
- [ ] Complexity analysis with game tree size calculations
- [ ] Visualization of game tree and search progress
- [ ] README with strategy explanation and design decisions
- [ ] AI vs AI gameplay demonstration
Option D: Custom Challenge (Instructor Approval Required)
Overview
If none of the predefined projects align with your interests or goals, you may propose a custom project that demonstrates mastery of recursive problem-solving. This option gives you creative freedom while ensuring you meet the course's learning objectives.
Approval Process
Step 1: Submit Proposal (Due: End of Week 7)
Your proposal must include:
- Problem statement: Clear description of what you're building
- Recursive components: Identify at least 3 distinct recursive patterns you'll use
- Complexity justification: Explain why this problem requires recursion
- Scope definition: MVP features and stretch goals
- Success criteria: How you'll demonstrate correctness and performance
Step 2: Instructor Review (Within 48 hours)
Instructor will evaluate:
- Does the project require substantial recursive problem-solving?
- Is the scope appropriate (15-20 hours of work)?
- Are the learning objectives clearly addressed?
- Is the project feasible within the timeframe?
Step 3: Approval or Revision
- Approved: Proceed with implementation
- Needs revision: Address feedback and resubmit
- Rejected: Choose from Options A-C instead
Requirements (All Custom Projects)
Your custom project must demonstrate:
1. Multiple Recursive Patterns (Minimum 3)
Choose from:
- First + Rest: Processing sequences recursively
- Divide and Conquer: Splitting problems in half
- Tree Recursion: Multiple recursive calls per level
- Backtracking: Exploring solution spaces with undo
- Mutual Recursion: Functions calling each other recursively
- Memoization: Caching recursive results
2. Substantial Complexity
Your project should involve:
- Non-trivial base cases: Multiple stopping conditions
- Complex recursive cases: More than simple linear recursion
- Optimization challenges: Performance issues requiring memoization or pruning
- Real-world applicability: Solves an actual problem, not just an academic exercise
3. Progressive Development
Document at least 5 iterations:
- Iteration 0: Naive implementation
- Iterations 1-3: Address specific failure modes
- Iteration 4+: Optimization and refinement

Each iteration must include:
- What broke and why (with Python output)
- Diagnostic analysis of the failure
- Solution implementation
- Verification that the fix worked
4. Comprehensive Testing
Include:
- Unit tests: Base cases, recursive cases, edge cases
- Performance benchmarks: Compare different approaches
- Stress tests: Large inputs, deep recursion
- Failure demonstrations: Intentionally broken versions
5. Professional Documentation
Provide:
- README: Problem description, usage examples, design decisions
- Complexity analysis: Time/space complexity with recurrence relations
- Call stack analysis: Maximum recursion depth for different inputs
- Trade-off discussion: When your approach works well vs. alternatives
Example Custom Project Ideas
Here are examples of appropriate custom projects (you may propose these or create your own):
Example 1: Recursive JSON Schema Validator
Problem: Validate JSON documents against complex schemas with nested structures.
Recursive Components:
1. Tree recursion: Traverse nested JSON objects/arrays
2. Mutual recursion: Different validators for different schema types
3. Backtracking: Try alternative schema interpretations
Complexity: Schemas can reference other schemas, creating recursive definitions.
MVP Features:
- Validate primitive types (string, number, boolean)
- Validate objects with required/optional properties
- Validate arrays with item schemas
- Support nested schemas
- Report validation errors with paths
Example:
schema = {
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "number"},
"address": {
"type": "object",
"properties": {
"street": {"type": "string"},
"city": {"type": "string"}
}
}
}
}
validator = JSONValidator(schema)
result = validator.validate(data)
print(result.errors) # List of validation errors with paths
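A rough sketch of the recursive dispatch such a validator might use (the schema vocabulary here is a simplified, illustrative subset, not the full JSON Schema spec):

```python
from typing import Any, Dict, List

def validate(value: Any, schema: Dict[str, Any], path: str = "$") -> List[str]:
    """Return a list of error messages; an empty list means the value conforms."""
    errors: List[str] = []
    expected = schema.get("type")

    if expected == "object":
        if not isinstance(value, dict):
            return [f"{path}: expected object"]
        for name, subschema in schema.get("properties", {}).items():
            if name in value:
                errors.extend(validate(value[name], subschema, f"{path}.{name}"))  # recurse
    elif expected == "array":
        if not isinstance(value, list):
            return [f"{path}: expected array"]
        for i, item in enumerate(value):
            errors.extend(validate(item, schema.get("items", {}), f"{path}[{i}]"))
    elif expected == "string" and not isinstance(value, str):
        errors.append(f"{path}: expected string")
    elif expected == "number" and not isinstance(value, (int, float)):
        errors.append(f"{path}: expected number")

    return errors
```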
Example 2: Recursive Regex Matcher
Problem: Implement a regular expression matcher using recursive descent.
Recursive Components:
1. Mutual recursion: Different matchers for different regex constructs
2. Backtracking: Try alternative match paths
3. Memoization: Cache match results for substrings
Complexity: Regex patterns can be nested (groups, alternation, quantifiers).
MVP Features:
- Literal character matching
- Character classes ([abc], [a-z])
- Quantifiers (*, +, ?)
- Grouping with parentheses
- Alternation (|)
Example:
matcher = RegexMatcher(r"a(b|c)*d")
assert matcher.match("ad")
assert matcher.match("abd")
assert matcher.match("abcd")
assert matcher.match("abbbcd")
assert not matcher.match("aed")
Example 3: Recursive Dependency Resolver
Problem: Resolve package dependencies with version constraints.
Recursive Components:
1. Tree recursion: Each package has multiple dependencies
2. Backtracking: Try different version combinations
3. Memoization: Cache resolution results
Complexity: Dependencies can conflict, requiring backtracking to find valid combinations.
MVP Features:
- Parse dependency specifications
- Resolve transitive dependencies
- Handle version constraints
- Detect circular dependencies
- Find valid version combinations
Example:
resolver = DependencyResolver()
resolver.add_package("A", "1.0", dependencies={"B": ">=2.0", "C": "^1.5"})
resolver.add_package("B", "2.1", dependencies={"C": "^1.0"})
resolver.add_package("C", "1.8", dependencies={})
result = resolver.resolve("A")
print(result.packages) # {"A": "1.0", "B": "2.1", "C": "1.8"}
Example 4: Recursive Maze Generator and Solver
Problem: Generate random mazes using recursive division, solve using recursive backtracking.
Recursive Components:
1. Divide and conquer: Recursive maze generation
2. Backtracking: Path finding through maze
3. Tree recursion: Multiple paths to explore
Complexity: Mazes can be arbitrarily large and complex.
MVP Features:
- Generate random mazes recursively
- Solve mazes using DFS with backtracking
- Find all paths from start to end
- Find shortest path
- Visualize maze and solution
Example:
maze = MazeGenerator(width=20, height=20)
maze.generate_recursive_division()
maze.print()
solver = MazeSolver(maze)
path = solver.find_path(start=(0, 0), end=(19, 19))
maze.print_with_path(path)
Example 5: Recursive Fractal Generator
Problem: Generate fractal images using recursive geometric transformations.
Recursive Components:
1. Tree recursion: Each branch spawns multiple sub-branches
2. Divide and conquer: Subdivide space recursively
3. Depth limiting: Control recursion depth for detail level
Complexity: Fractals have self-similar structure at multiple scales.
MVP Features:
- Generate classic fractals (Sierpinski triangle, Koch snowflake, tree)
- Parameterize recursion depth
- Export as images
- Animate fractal generation
- Support custom transformation rules
Example:
fractal = FractalGenerator()
fractal.generate_tree(
depth=8,
branch_angle=30,
branch_length_ratio=0.7
)
fractal.save("tree.png")
Proposal Template
Use this template for your custom project proposal:
# Custom Project Proposal: [Project Title]
## Student Information
- Name: [Your name]
- Email: [Your email]
## Problem Statement
[2-3 paragraphs describing the problem you're solving]
## Recursive Components
### Pattern 1: [Pattern Name]
- **Where used**: [Specific component]
- **Why recursive**: [Justification]
- **Base case**: [Description]
- **Recursive case**: [Description]
### Pattern 2: [Pattern Name]
[Same structure]
### Pattern 3: [Pattern Name]
[Same structure]
## Complexity Justification
[Explain why this problem requires recursion and can't be easily solved iteratively]
## Scope Definition
### MVP Features (Must Have)
1. [Feature 1]
2. [Feature 2]
3. [Feature 3]
...
### Stretch Goals (Nice to Have)
1. [Feature 1]
2. [Feature 2]
...
## Success Criteria
### Correctness
- [How you'll verify correctness]
- [Test cases you'll implement]
### Performance
- [Performance benchmarks you'll run]
- [Expected complexity]
### Code Quality
- [Documentation standards]
- [Testing coverage goals]
## Timeline
- Week 7: Proposal submission
- Week 8, Days 1-2: Iterations 0-2 (basic functionality)
- Week 8, Days 3-4: Iterations 3-4 (optimization)
- Week 8, Days 5-6: Testing and documentation
- Week 8, Day 7: Final submission
## Questions for Instructor
[Any questions or concerns about the project]
Evaluation Criteria (Custom Projects)
Your custom project will be evaluated using the same criteria as Options A-C:
| Criterion | Weight | What We're Looking For |
|---|---|---|
| Correctness | 30% | Solution works for all test cases, handles edge cases |
| Recursive Design | 25% | Appropriate use of recursion, clear problem decomposition |
| Code Quality | 20% | Readable, well-structured, properly documented |
| Analysis Depth | 15% | Thorough complexity analysis, insightful trade-off discussion |
| Testing | 10% | Comprehensive test coverage, meaningful benchmarks |
Common Rejection Reasons
Projects are typically rejected for one of these reasons:
- Insufficient recursion: Problem can be easily solved iteratively
- Too simple: Doesn't demonstrate mastery of multiple patterns
- Too complex: Scope exceeds 15-20 hours of work
- Unclear objectives: Success criteria not well-defined
- Not original work: Copying existing implementations
Getting Help
If you're unsure whether your idea is appropriate:
- Office hours: Discuss your idea with the instructor
- Discussion forum: Post your idea for peer feedback
- Draft proposal: Submit a rough draft for early feedback
Deliverables Checklist (Custom Projects)
- [ ] Approved project proposal
- [ ] Core implementation demonstrating all recursive patterns
- [ ] At least 5 documented iterations showing progression
- [ ] Comprehensive test suite (30+ tests)
- [ ] Performance benchmarks comparing approaches
- [ ] Complexity analysis with recurrence relations
- [ ] README explaining problem, design, and usage
- [ ] Example usage demonstrating all features
- [ ] Reflection document discussing what you learned
Submission Guidelines and Grading
Submission Guidelines
What to Submit
Your final submission must include:
1. Source Code
- All Python files implementing your project
- Organized in a clear directory structure
- Following PEP 8 style guidelines
2. Test Suite
- Comprehensive unit tests
- Performance benchmarks
- Test data generators (if applicable)
3. Documentation
- README.md: Project overview, setup instructions, usage examples
- DESIGN.md: Design decisions, recursive patterns used, trade-off analysis
- ITERATIONS.md: Documented progression through iterations with failure analysis
- COMPLEXITY.md: Detailed complexity analysis with recurrence relations
4. Examples
- Demonstration scripts showing all features
- Sample inputs and expected outputs
- Visualization examples (if applicable)
Directory Structure
Organize your submission like this:
final-project/
├── README.md              # Project overview
├── DESIGN.md              # Design decisions
├── ITERATIONS.md          # Development progression
├── COMPLEXITY.md          # Complexity analysis
├── requirements.txt       # Python dependencies
├── src/                   # Source code
│   ├── __init__.py
│   ├── core.py            # Core implementation
│   ├── utils.py           # Helper functions
│   └── visualize.py       # Visualization (if applicable)
├── tests/                 # Test suite
│   ├── __init__.py
│   ├── test_core.py       # Unit tests
│   ├── test_edge_cases.py # Edge case tests
│   └── benchmarks.py      # Performance benchmarks
└── examples/              # Usage examples
    ├── basic_usage.py
    ├── advanced_usage.py
    └── demo.py
Code Quality Standards
Your code must meet these standards:
1. Type Hints
Use type hints for all function signatures:
def find_best_move(
state: GameState,
depth: int = 0,
alpha: float = float('-inf'),
beta: float = float('inf')
) -> Tuple[Optional[Move], float]:
"""Find optimal move using minimax."""
pass
2. Docstrings
Document all public functions and classes:
def minimax(
state: GameState,
depth: int,
maximizing: bool
) -> float:
"""
Recursive minimax algorithm.
Args:
state: Current game state
depth: Current search depth
maximizing: True if maximizing player's turn
Returns:
Evaluation score for the position
Base Cases:
- Terminal state (game over)
- Maximum depth reached
Recursive Case:
- Try all legal moves
- Recurse on resulting states
- Return best score for current player
Time Complexity: O(b^d) where b is branching factor, d is depth
Space Complexity: O(d) for call stack
"""
pass
3. Comments
Explain complex recursive logic:
def merge_intervals(intervals: List[Interval]) -> List[Interval]:
"""Recursively merge overlapping intervals."""
# Base case: 0 or 1 intervals
if len(intervals) <= 1:
return intervals
# Recursive case: process first interval
first = intervals[0]
rest = intervals[1:]
# Check if first overlaps with any in rest
for i, interval in enumerate(rest):
if first.overlaps(interval):
# Merge and recurse on remaining
merged = first.merge(interval)
remaining = rest[:i] + rest[i+1:]
return merge_intervals([merged] + remaining) # ← Recursive call
# No overlap: first is separate
return [first] + merge_intervals(rest) # ← Recursive call
4. Error Handling
Handle edge cases gracefully:
import logging
from pathlib import Path

logger = logging.getLogger(__name__)

def analyze_directory(path: Path) -> DirectoryStats:
"""Recursively analyze directory."""
try:
if not path.exists():
raise ValueError(f"Path does not exist: {path}")
if not path.is_dir():
raise ValueError(f"Path is not a directory: {path}")
# Main logic here
except PermissionError:
logger.warning(f"Permission denied: {path}")
return DirectoryStats(path) # Return empty stats
except OSError as e:
logger.error(f"OS error analyzing {path}: {e}")
return DirectoryStats(path)
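Error paths deserve tests as well. A minimal sketch using pytest's built-in tmp_path fixture, assuming the analyze_directory function from the example above lives in src/core.py (adjust the import to your own layout):

```python
from pathlib import Path

import pytest

from src.core import analyze_directory  # assumed location; adjust to your layout


def test_missing_path_raises_value_error(tmp_path: Path) -> None:
    """A path that does not exist should be rejected explicitly."""
    missing = tmp_path / "does_not_exist"
    with pytest.raises(ValueError):
        analyze_directory(missing)
```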
Documentation Requirements
README.md Template
````markdown
# [Project Title]

## Overview

[2-3 paragraphs describing what your project does]

## Features

- Feature 1
- Feature 2
- Feature 3

## Installation

```bash
pip install -r requirements.txt
```

## Quick Start

```python
from src.core import [YourClass]

# Basic usage example
example = [YourClass]()
result = example.solve()
print(result)
```

## Usage Examples

### Example 1: [Description]

```python
# Code example
```

### Example 2: [Description]

```python
# Code example
```

## Testing

Run the test suite:

```bash
pytest tests/
```

Run benchmarks:

```bash
python tests/benchmarks.py
```

## Performance

[Include benchmark results, complexity analysis summary]

## Design Decisions

[Brief summary, link to DESIGN.md for details]

## License

[Your choice of license]
````
ITERATIONS.md Template
````markdown
# Development Iterations

## Iteration 0: Initial Implementation

### Goal
[What you were trying to achieve]

### Implementation
```python
# Initial code
```

### Result
[What happened when you ran it]

### Limitations
[What didn't work or could be improved]

## Iteration 1: [Improvement Description]

### Problem Identified
[Specific failure mode from Iteration 0]

### Python Output
[Complete error message or incorrect output]

### Diagnostic Analysis
- Error message interpretation: [What the error tells us]
- Call stack analysis: [What the traceback reveals]
- Root cause: [Why the code failed]

### Solution
[What technique you applied]

### Implementation
```python
# Improved code
```

### Verification
```python
# Test showing it now works
```

### Result
[Evidence the improvement worked]

[Repeat for each iteration]
````
COMPLEXITY.md Template
```markdown
# Complexity Analysis
## Time Complexity
### Recurrence Relation
[Mathematical recurrence relation]
### Solution
[How the recurrence solves to Big-O]
### Explanation
[Intuitive explanation of why this complexity]
## Space Complexity
### Call Stack Depth
[Maximum recursion depth]
### Additional Space
[Any extra data structures used]
### Total Space
[Overall space complexity]
## Empirical Analysis
### Benchmark Results
| Input Size | Time (s) | Nodes Explored | Call Stack Depth |
|------------|----------|----------------|------------------|
| 10 | 0.001 | 45 | 5 |
| 100 | 0.015 | 523 | 8 |
| 1000 | 0.234 | 6,847 | 12 |
### Complexity Verification
[Graph or table showing empirical complexity matches theoretical]
## Optimization Impact
### Before Optimization
- Time: [complexity]
- Space: [complexity]
### After Optimization
- Time: [complexity]
- Space: [complexity]
### Improvement Factor
[Quantitative improvement]
```
Submission Process
Step 1: Self-Review Checklist
Before submitting, verify:
- [ ] All code runs without errors
- [ ] All tests pass
- [ ] Code follows PEP 8 style guidelines (see the lint example after this checklist)
- [ ] All functions have type hints and docstrings
- [ ] README includes clear usage examples
- [ ] ITERATIONS.md documents at least 5 iterations
- [ ] COMPLEXITY.md includes recurrence relations
- [ ] Performance benchmarks are included
- [ ] No placeholder comments like "TODO" or "FIX THIS"
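For the PEP 8 item, a linter gives a quick mechanical check; flake8 is one common choice, but any equivalent tool is fine:

```bash
pip install flake8
flake8 src/ tests/ examples/
```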
Step 2: Create Submission Archive
```bash
# Create a zip file of your project
cd final-project/
zip -r final-project-[YOUR_NAME].zip . -x "*.pyc" -x "__pycache__/*" -x ".git/*"
```
Step 3: Submit via Course Platform
Upload your zip file to the course submission system by the deadline.
Grading Rubric
Your project will be graded using this detailed rubric:
Correctness (30 points)
| Criterion | Points | Description |
|---|---|---|
| Core functionality works | 15 | All MVP features implemented and working |
| Edge cases handled | 8 | Proper handling of empty inputs, errors, boundary conditions |
| Test coverage | 7 | Comprehensive test suite with >80% coverage (see the example below) |
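If you want to measure the coverage figure yourself, one option (assuming the pytest-cov plugin is installed) is:

```bash
pip install pytest-cov
pytest --cov=src --cov-report=term-missing tests/
```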
Recursive Design (25 points)
| Criterion | Points | Description |
|---|---|---|
| Appropriate recursion use | 10 | Recursion used where it clarifies solution |
| Multiple patterns demonstrated | 8 | At least 3 distinct recursive patterns |
| Base cases correct | 7 | All base cases identified and handled properly |
Code Quality (20 points)
| Criterion | Points | Description |
|---|---|---|
| Readability | 8 | Clear variable names, logical structure, appropriate comments |
| Documentation | 7 | Type hints, docstrings, README, design docs |
| Style | 5 | Follows PEP 8, consistent formatting |
Analysis Depth (15 points)
| Criterion | Points | Description |
|---|---|---|
| Complexity analysis | 7 | Correct recurrence relations and Big-O analysis |
| Trade-off discussion | 5 | Thoughtful comparison of approaches |
| Iteration documentation | 3 | Clear progression through failures and fixes |
Testing (10 points)
| Criterion | Points | Description |
|---|---|---|
| Unit tests | 5 | Comprehensive coverage of functionality |
| Performance benchmarks | 3 | Meaningful performance comparisons |
| Edge case tests | 2 | Tests for boundary conditions and errors |
Late Submission Policy
- Within 24 hours: -10% penalty
- Within 48 hours: -20% penalty
- Within 72 hours: -30% penalty
- After 72 hours: Not accepted (must retake course)
Academic Integrity
This is an individual project. You may:
- Discuss high-level approaches with classmates
- Use course materials and lecture notes
- Search for general recursion concepts online
- Use Python standard library and common packages

You may NOT:
- Copy code from classmates or online sources
- Use AI code generation tools (ChatGPT, Copilot, etc.)
- Submit work done by someone else
- Share your code with other students
Violations will result in a zero on the project and possible course failure.
Getting Help
If you're stuck:
- Review course materials: Revisit relevant chapters and examples
- Office hours: Attend instructor office hours for guidance
- Discussion forum: Post questions (without sharing code)
- Debugging guide: Follow the systematic debugging process from Chapter 8.4
Frequently Asked Questions
Q: Can I use external libraries?
A: Yes, but your core recursive logic must be your own implementation. Document any external libraries in requirements.txt.
Q: How much code is expected?
A: Quality over quantity. A well-designed solution might be 300-500 lines of core code plus tests. Focus on clarity and correctness.
Q: Can I exceed the recursion limit?
A: Your solution should work within Python's default recursion limit (1000). If you need more, justify it in your documentation and use sys.setrecursionlimit() carefully.
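If you do raise the limit, keep the change local so it cannot leak into unrelated code. One hedged way to do that is a small context manager; the limit value and the deep_solve/big_input names below are placeholders for your own code:

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit: int):
    """Temporarily raise Python's recursion limit, then restore the old value."""
    old_limit = sys.getrecursionlimit()
    sys.setrecursionlimit(limit)
    try:
        yield
    finally:
        sys.setrecursionlimit(old_limit)


# Hypothetical usage: deep_solve and big_input stand in for your own code
with recursion_limit(5000):
    result = deep_solve(big_input)
```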
Q: What if my project is too ambitious?
A: Focus on the MVP first. Implement stretch goals only if time permits. A complete MVP is better than an incomplete ambitious project.
Q: Can I change projects after approval?
A: Only with instructor permission and only if you have a compelling reason. Changes after Week 7 are generally not allowed.
Q: How do I demonstrate "mastery"?
A: Show deep understanding through:
- Correct implementation of multiple recursive patterns
- Thoughtful analysis of trade-offs
- Systematic debugging and iteration
- Clear explanation of design decisions
Final Checklist
Before submitting, ensure you have:
- [ ] Chosen a project option (A, B, C, or approved D)
- [ ] Implemented all MVP features
- [ ] Documented at least 5 iterations with failure analysis
- [ ] Written comprehensive tests (30+ test cases)
- [ ] Included performance benchmarks
- [ ] Completed complexity analysis with recurrence relations
- [ ] Written clear documentation (README, DESIGN, ITERATIONS, COMPLEXITY)
- [ ] Verified all code runs without errors
- [ ] Created submission archive
- [ ] Submitted via course platform before deadline
Good luck! This is your opportunity to demonstrate everything you've learned about recursive problem-solving. Make it count!